Distributed Real-Time Fault Tolerance on a Virtualized Multi-Core System

Authors

  • Eric Missimer
  • Richard West
  • Ye Li
Abstract

This paper presents several approaches to real-time fault tolerance that use redundancy methods on multi-core systems. Using hardware virtualization, a distributed system on a chip is created, in which the cores are isolated from one another except for explicit communication channels. With this system architecture, redundant tasks that would typically run on separate processors can be consolidated onto a single multi-core processor while still maintaining high confidence in system reliability. A multi-core chip-level distributed system could therefore offer an alternative to traditional automotive systems, for example, which typically use a controller area network (CAN) bus to interconnect multiple electronic control units. With memory as the explicit communication channel, new recovery techniques that require higher bandwidths and lower latencies than traditional networks provide now become viable. In this work, we discuss several such techniques that we are considering for our chip-level distributed system, called Quest-V.
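As a rough illustration of the kind of redundancy the abstract refers to, the sketch below shows majority voting over the outputs of replicated tasks. It is not taken from the paper and does not describe how Quest-V works: the `majority_vote` function and the `mailbox` list, which stands in for a shared-memory communication channel with one slot per isolated core, are hypothetical names chosen for this example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(outputs: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the value reported by a strict majority of replicas, else None.

    A None entry models a replica that crashed or missed its deadline.
    """
    valid = [o for o in outputs if o is not None]
    if not valid:
        return None
    value, count = Counter(valid).most_common(1)[0]
    # Require a strict majority of ALL replicas, not only those that answered.
    return value if count > len(outputs) // 2 else None


# Hypothetical shared-memory mailbox: one slot per redundant task replica.
mailbox = [42, 42, 41]  # the third replica produced a faulty result
print(majority_vote(mailbox))
```

With three replicas, a single faulty or silent replica is outvoted by the other two. If no value reaches a strict majority, the function returns `None` so that the caller can trigger a recovery action instead of acting on an unverified result.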

Similar resources

Replication and Resubmission Based Adaptive Decision for Fault Tolerance in Real Time Cloud Computing: A New Approach

Cloud computing, an adoptable technology, is the outcome of the evolution of on-demand service in the computing paradigm of large-scale distributed computing. With the rising demands and benefits of cloud computing infrastructure, society can leverage the intensive computing services and the scalable, virtualized environment of cloud computing to carry out real-time tasks executed on a remote clou...

A Fault Observant Real-Time Embedded Design for Network-on-Chip Control Systems

Performance and time-to-market requirements cause many real-time designers to consider commercial off-the-shelf (COTS) components for real-time systems. Massive multi-core embedded processors with network-on-chip (NoC) designs to facilitate core-to-core communication are becoming common in COTS. These architectures benefit real-time scheduling, but they also pose predictability challenges. In this work, w...

Multi-Layer Fault Tolerance for Distributed Real-Time Systems

This thesis addresses issues in building fault-tolerant distributed real-time systems. Such systems are increasingly deployed in automotive and avionics applications. We focus on the design and validation of fault tolerance mechanisms. From the design viewpoint, we develop the notion of multi-layer fault tolerance. A fault-tolerant distributed system contains a set of mechanisms that provide er...

Fault Tolerance in Real Time Distributed System

In this paper we investigate the different fault tolerance techniques used in many real-time distributed systems. The main focus is on the types of faults that occur in the system, the fault detection techniques, and the recovery techniques used. A fault, whether caused by link failure, resource failure, or any other reason, must be tolerated for the system to work smoothly and accurately. Th...

Study and Simulation of a Distributed Real-time Fault-tolerance Web Monitoring System

The goal of this project is to study and simulate a distributed real-time fault-tolerance web monitoring system. The method of providing fault-tolerance is to schedule multiple copies of a task on different computer nodes in a distributed computing system. A fault-tolerant system automatically recovers from a specified number of failures. If the primary task cannot be completed due to a fault, ...

Publication date: 2013